(54) STEREOSCOPIC VIDEO IMAGE RECEIVER AND STEREOSCOPIC VIDEO IMAGE SYSTEM

(57)Abstract: 

PROBLEM TO BE SOLVED: To reduce the cost of a stereoscopic video
image receiver and to generate a stereoscopic video image by
processing received video information in real time.

SOLUTION: In the processing of generating a stereoscopic video image
from a 2-dimensional video image, a stereoscopic video image
transmitter 101 conducts pre-processing that extracts the additional
information required to generate the stereoscopic video image, such as
a depth value for each pixel or each element of the 2-dimensional video
image, and sends a coded signal consisting of the additional
information obtained by the pre-processing and the 2-dimensional video
image to a stereoscopic video image receiver 105. The stereoscopic
video image receiver 105 receives the transmitted signal, decodes the
2-dimensional video image and the additional information respectively,
and generates a stereoscopic video image using parallax information
based on the decoded 2-dimensional video image and the decoded
additional information.
* NOTICES *



JPO and NCIPI are not responsible for any
damages caused by the use of this translation.

1. This document has been translated by computer. So the translation may not reflect the original precisely.

2. **** shows the word which can not be translated.
3. In the drawings, any words are not translated.

CLAIMS 



[Claim(s)] 

[Claim 1] A stereoscopic video image receiver which generates a stereoscopic video image based on image
information transmitted from a transmitter, characterized by comprising: receiving means for receiving a
signal in which two-dimensional image information transmitted from said transmitter and additional
information for generating the stereoscopic video image are encoded; decoding means for decoding said
two-dimensional image information and said additional information, respectively, from the signal received
by said receiving means; and generation means for generating the stereoscopic video image using parallax
information, using said two-dimensional image information and said additional information decoded by said
decoding means.

[Claim 2] The stereoscopic video image receiver according to claim 1, characterized in that said additional
information is a depth value corresponding to each pixel of a two-dimensional image.
[Claim 3] The stereoscopic video image receiver according to claim 2, characterized by having means for
changing said depth value, taking said depth value as a reference, based on a value specified by an observer.
[Claim 4] A stereoscopic video image receiver which generates a stereoscopic video image based on image
information transmitted from a transmitter, characterized by comprising: receiving means for receiving a
signal in which two-dimensional image information transmitted from said transmitter and divided into
elements, positional information indicating the positional relationship between the elements, and a depth
value of each element are encoded; decoding means for decoding said two-dimensional image information, said
positional information, and said depth value, respectively, from the signal received by said receiving
means; and generation means for compositing the elements and generating the stereoscopic video image using
parallax information, based on the two-dimensional image information for each element, the positional
information between the elements, and the depth value for each element decoded by said decoding means.

[Claim 5] The stereoscopic video image receiver according to claim 4, characterized by having means for
changing said depth value, taking said depth value received by said receiving means as a reference, based
on a value specified by an observer.

[Claim 6] A stereoscopic video image system consisting of a transmitter which transmits image information
and a receiver which receives the image information transmitted from the transmitter and generates a
stereoscopic video image, characterized in that said transmitter consists of: pre-processing means for
generating, as pre-processing, additional information for determining the three-dimensional shape of an
image from a two-dimensional video signal and generating the stereoscopic video image; coding means for
encoding said two-dimensional video signal and also encoding the additional information generated by said
pre-processing means; and transmitting means for transmitting to said receiver the signal in which the
two-dimensional video signal and the additional information have been encoded by said coding means; and in
that said receiver consists of: receiving means for receiving the signal transmitted from said transmitter;
decoding means for decoding said two-dimensional video signal and said additional information, respectively,
from the signal received by said receiving means; and generation means for generating the stereoscopic video
image using parallax information, based on said two-dimensional video signal and said additional information
decoded by said decoding means.
[Claim 7] A stereoscopic video image system consisting of a transmitter which transmits image information
and a receiver which receives the image information transmitted from the transmitter and generates a
stereoscopic video image, characterized in that said transmitter consists of: pre-processing means for
dividing, as pre-processing, a two-dimensional video signal into elements according to the content of the
image and determining the depth value of each divided element; coding means for encoding the two-dimensional
video signal for each element divided by said pre-processing means and also encoding the positional
information between the elements and the depth value for each element; and transmitting means for
transmitting the signal in which the two-dimensional video signal for each element, the positional
information between the elements, and the depth value for each element have been encoded by said coding
means; and in that said receiver consists of: receiving means for receiving the signal transmitted from said
transmitter; decoding means for decoding the two-dimensional video signal for each element, the positional
information between the elements, and the depth value for each element, respectively, from the signal
received by said receiving means; and generation means for compositing the elements and generating the
stereoscopic video image using parallax information, based on the two-dimensional video signal for each
element, the positional information between the elements, and the depth value for each element decoded by
said decoding means.



[Translation done.] 

DETAILED DESCRIPTION 



[Detailed Description of the Invention] 
[0001] 

[Field of the Invention] This invention relates to a stereoscopic video image receiver and a stereoscopic
video image system which generate a stereoscopic video image based on a two-dimensional video signal and
additional information, such as depth information, extracted from the two-dimensional video signal.
[0002] 

[Description of the Prior Art] Conventionally, while the development of consumer stereoscopic display
devices (stereoscopic displays) has been progressing, little consumer software for stereoscopic video images
is available, and in order to display a stereoscopic video image it is necessary to produce one anew.

[0003] Accordingly, converting conventional two-dimensional images into stereoscopic video images has been
proposed as a way of making use of the existing stock of two-dimensional images. A configuration for
converting a two-dimensional image into a stereoscopic video image is shown in FIG. 10. In the same way as
when an ordinary two-dimensional video signal is transmitted, a transmitter 1001 encodes the two-dimensional
video signal with an encoder 1002 and transmits it. A receiver 1003 receives the encoded signal from the
transmitter 1001, decodes the two-dimensional video signal with a decoder 1004, and then generates two
parallax video signals with a parallax image generator 1005. That is, all of the conversion to a
stereoscopic video image is handled by the receiver 1003.
[0004] One method of generating a stereoscopic video image assumes, for example, that the area near the
center of the screen is always nearer to the viewer in the depth direction, and generates the stereoscopic
video image so that the area near the center always appears in front.

[0005] Another method assumes that a moving region is nearer to the viewer. When the motion is from left to
right, the right eye is given an image that is several fields **** relative to the image given to the left
eye, so that the moving region is perceived in front.
[0006] At present, image processing techniques such as those described above are advancing, and research on
estimating a three-dimensional shape from a two-dimensional image, whether a moving picture or a still
picture, is being actively pursued. Meanwhile, MPEG4 is being standardized for the coding of images. Its
configuration is shown in FIG. 11. First, a two-dimensional image such as that shown in FIG. 12(a) is input
to a transmitter 1011. In an arithmetic unit 1012, the image is decomposed into elements as shown in
FIG. 12(b) to (d), each element is encoded by an encoder 1013, and information on the positional
relationship between the elements is also encoded. The encoded signals are multiplexed by a multiplexer 1014
and transmitted from the transmitter. The transmitted signal is received by a receiver 1015 and is first
separated into the individual elements by a demultiplexer 1016. Each separated element is decoded by a
decoder 1017, which at the same time decodes the information on the positional relationship between the
elements. As shown in FIG. 12(e), the decoded signals are composited in a compositing section 1018 based on
the positional relationship information and are output as an image identical to the original image.
[0007] 

[Problem(s) to be Solved by the Invention] However, the output of the apparatus of FIG. 11 described above
is still only a two-dimensional image. Consequently, when a stereoscopic video image receiver attempts to
perform by itself all of the processing that converts a two-dimensional image into a more realistic
stereoscopic video image, there is the drawback that the receiver becomes expensive or that the processing
becomes difficult to carry out in real time.

[0008] Accordingly, this invention aims to offer a stereoscopic video image receiver and a stereoscopic
video image system which lower the cost of the receiver and which can process the received image information
in real time to generate a stereoscopic video image.
[0009] 

[Means for Solving the Problem] In order to solve the above-mentioned technical problem, the stereoscopic
video image receiver of this invention, which generates a stereoscopic video image based on image
information transmitted from a transmitter, comprises: receiving means for receiving a signal in which
two-dimensional image information transmitted from said transmitter and additional information for
generating the stereoscopic video image are encoded; decoding means for decoding said two-dimensional image
information and said additional information, respectively, from the signal received by the receiving means;
and generation means for generating the stereoscopic video image using parallax information, using said
two-dimensional image information and said additional information decoded by the decoding means.

[0010] With the above means, the two-dimensional image information and the additional information are both
sent from the transmitter, so the receiver does not need to derive anew, from the two-dimensional image
information, the information for producing a stereoscopic video image, and can immediately use the
additional information that has been sent. Signal processing in the receiver can therefore be greatly
simplified.
[0011] 

[Embodiment of the Invention] Hereafter, embodiments of this invention are explained with reference to the
drawings. FIG. 1 is a block diagram for explaining a first embodiment of this invention. This stereoscopic
video image system consists of a transmitter (stereoscopic video image transmitter) which transmits image
information, and a receiver (stereoscopic video image receiver) which receives the image information
transmitted from the transmitter and displays the image. Examples of the transmitter include the transmitter
in a television broadcasting station, a video camera, and a video tape recorder (VTR). Examples of the
receiver, which receives and displays the image information transmitted from such a transmitter, include
three-dimensional display equipment such as a stereoscopic display.

[0012] As shown in FIG. 1, the transmitter 101 on the transmitting side consists of an arithmetic unit 102,
an encoder 103, and a multiplexer 104, and the receiver 105 on the receiving side consists of a
demultiplexer 106, a decoder 107, and a parallax image generator 108.
[0013] First, the two-dimensional video signal to be transmitted is input to the arithmetic unit 102 in the
transmitter 101 on the transmitting side. The arithmetic unit 102 performs the pre-processing required for
generating a stereoscopic video image and computes additional information useful for that generation. The
information most directly useful for generating a stereoscopic video image is the three-dimensional shape
estimated from the two-dimensional image, which is expressed, for example, as a depth value for each pixel
of the two-dimensional image.

[0014] Many methods of extracting a three-dimensional shape from a two-dimensional image currently exist in
the field of computer vision and are the subject of research. Since a two-dimensional video signal that
changes along the time axis is input to the arithmetic unit 102 of this first embodiment, the parameters
available for estimating the three-dimensional shape include motion parallax, shading, texture gradient,
linear perspective, the two-dimensional shape of an element, the overlap between elements, and the position
and size of an element. Furthermore, the parameters for estimating motion parallax, the two-dimensional
shape of an element, and the overlap between elements include displacements such as motion vectors, and the
above parameters are related to one another. Therefore, in addition to the depth value, all of the above
parameters qualify as additional information useful for generating a stereoscopic video image. That is, the
parameters of the three-dimensional elements that are indispensable for generating a stereoscopic video
image are determined by the transmitter 101 and transmitted to the receiver 105.

[0015] Computing the additional information involved in estimating a three-dimensional shape from a
two-dimensional image, as described above, requires a large amount of processing. In this first embodiment
the processing is assigned to the transmitting side, where real-time operation is often not required and an
arithmetic unit of higher capability than on the receiving side can be used. For this reason, a stereoscopic
video image of higher precision can be generated. Although it is desirable that the computation described
above for estimating the three-dimensional shape from the two-dimensional image be performed automatically,
a person may instead set depths manually to match the two-dimensional image, and these may be used as the
additional information.
[0016] The additional information generated by the arithmetic unit 102 is encoded by the encoder 103
together with the two-dimensional image information. The two encoded signals are multiplexed by the
multiplexer 104 and transmitted from the transmitter 101 to the receiver 105.

[0017] The receiver 105 then receives the signal transmitted from the transmitter 101. First, the
demultiplexer 106 of the receiver 105 separates this signal into a two-dimensional video signal and an
additional information signal, and the decoder 107 decodes each signal. The decoded signals are input to the
parallax image generator 108, which performs the post-processing required for generating a stereoscopic
video image and generates two parallax images; these are then presented separately to the observer's two eyes.
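
The signal path of paragraphs [0016] and [0017] can be sketched as follows. The text does not specify a
multiplex format, so the length-prefixed packet layout, the stream identifiers VIDEO and EXTRA, and the
function names are assumptions made only for illustration; the payloads stand in for already-encoded data.

```python
import struct

# Hypothetical stream identifiers; the patent does not define any.
VIDEO, EXTRA = 0x01, 0x02


def multiplex(packets):
    """Join (stream_id, payload) pairs into one byte string.

    Each packet is written as a 1-byte stream id, a 4-byte big-endian
    payload length, and then the payload itself.
    """
    return b"".join(struct.pack(">BI", sid, len(p)) + p for sid, p in packets)


def demultiplex(stream):
    """Split a multiplexed byte string back into per-stream payload lists."""
    out = {}
    pos = 0
    header = struct.calcsize(">BI")  # 5 bytes: id + length
    while pos < len(stream):
        sid, size = struct.unpack_from(">BI", stream, pos)
        pos += header
        out.setdefault(sid, []).append(stream[pos:pos + size])
        pos += size
    return out
```

Under this assumed layout, a receiver that handles only two-dimensional images could read the VIDEO
payloads and skip the EXTRA ones, since every packet carries its own length.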
[0018] FIG. 2 is a conceptual diagram for the case in which a depth value for each pixel is used as
additional information. As shown in FIG. 2(a) to (d), the two-dimensional video signal 111 input to the
transmitter 101 and the depth values (additional information) 112 for each pixel computed from this
two-dimensional video signal 111 are transmitted to the receiver 105. The receiver 105 outputs the parallax
image (left image) 113 and the parallax image (right image) 114 generated from the received information.

[0019] Moreover, when the depth value of each pixel is known, the location in the parallax image to which
each pixel of the two-dimensional image is moved can be determined uniquely from the screen size, the
observer's viewing distance, and the distance between the observer's eyes.
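
One way to make this relationship concrete is the similar-triangles sketch below. The text gives no formula,
so the model is an assumption: the eyes are placed symmetrically in front of a flat screen, and depth is
measured from the screen plane, positive behind it and negative in front of it. The function and parameter
names are hypothetical.

```python
def screen_parallax(depth, viewing_distance, eye_separation):
    """Signed distance on the screen between the right-eye and left-eye
    projections of one point.

    depth            -- distance of the point behind the screen plane
                        (negative when the point is in front of the screen)
    viewing_distance -- distance from the observer's eyes to the screen
    eye_separation   -- distance between the observer's eyes
    All three use the same length unit. The result is 0 on the screen plane,
    positive behind it, and negative in front of it.
    """
    if viewing_distance + depth <= 0:
        raise ValueError("point must lie in front of the observer")
    return eye_separation * depth / (viewing_distance + depth)


def parallax_in_pixels(depth, viewing_distance, eye_separation,
                       screen_width, pixels_across):
    """Convert the parallax to pixels for a screen of the given physical width."""
    metric = screen_parallax(depth, viewing_distance, eye_separation)
    return metric * pixels_across / screen_width
```

In this model each eye's image is shifted by half of the returned value in opposite directions, which is
why the result depends on exactly the three quantities named above.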

[0020] For example, as shown in FIG. 3, each pixel at its original location in the two-dimensional image is
moved, according to its depth value, to a pixel location for the right eye and a pixel location for the left
eye, and these locations are obtained geometrically. Moreover, when a two-dimensional image is converted
into a parallax image, pixels that do not exist in the two-dimensional image must be generated. In this case
they are generated, for example, by interpolation from neighboring pixels.
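
The pixel shift and the interpolation from neighboring pixels can be sketched for a single scanline as
follows. This is a minimal illustration and not the method of the embodiment: the integer shifts, the
"nearer pixel wins" rule for collisions, and the name warp_row are assumptions.

```python
def warp_row(row, shift, nearness):
    """Build one scanline of a parallax image.

    row      -- pixel values of the original scanline
    shift    -- per-pixel horizontal shift in whole pixels for this eye
    nearness -- per-pixel value, larger meaning closer to the viewer; used
                only to decide which pixel wins when two land on one target
    Positions that receive no pixel are filled from the nearest filled
    neighbour, preferring the one on the left.
    """
    width = len(row)
    out = [None] * width
    best = [float("-inf")] * width
    for x, value in enumerate(row):
        target = x + shift[x]
        if 0 <= target < width and nearness[x] > best[target]:
            out[target] = value
            best[target] = nearness[x]
    for x in range(width):
        if out[x] is None:
            left = next((out[i] for i in range(x - 1, -1, -1)
                         if out[i] is not None), None)
            right = next((out[i] for i in range(x + 1, width)
                          if out[i] is not None), None)
            out[x] = left if left is not None else right
    return out
```

Calling the function once with the left-eye shifts and once with the right-eye shifts yields the two
parallax scanlines.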

[0021] As stated above, when a stereoscopic video image is generated from a two-dimensional image, the
transmitting side and the receiving side share the processing. The depth values, which impose a high
processing load, are determined on the transmitting side in non-real time, and the parallax images, whose
generation must be adapted to the configuration of the display and the viewing distance, are generated on
the receiving side from the depth values.
[0022] As a result, the receiver can generate a highly precise stereoscopic video image in real time.
Moreover, since the depth values are sent as additional information accompanying an ordinary two-dimensional
video signal, compatibility is maintained with televisions that can display only two-dimensional images.
[0023] Next, FIG. 4 is a block diagram for explaining a second embodiment of this invention. First, the
two-dimensional video signal to be transmitted is input to the arithmetic unit 202 in the transmitter 201 on
the transmitting side. The arithmetic unit 202 performs pattern recognition on the input two-dimensional
video signal and divides the picture into regions, one for each element. Various region division techniques
are introduced in, for example, the Image Processing Handbook (published in Showa 62 (1987), **** Co.). The
arithmetic unit 202 also estimates the depth of each element at the same time.

[0024] The video signal decomposed into elements by the arithmetic unit 202 is encoded element by element by
the encoder 203. Although FIG. 4 shows the encoder 203 with a plurality of coding sections arranged inside
that operate in parallel, a single coding section may instead process the elements sequentially.

[0025] While encoding each element, the encoder 203 also encodes positional relationship information
describing the arrangement in which the elements are to be composited. The scene description language
standardized in MPEG4 (BIFS: Binary Format for Scenes) may be used for this coding.
[0026] The encoder 203 also encodes the depth value of each element as additional information. An example of
this coding format is shown in FIG. 5. As shown in FIG. 5, the index field (index) holds an identification
code indicating that the signal is depth-value information, and also stores a value indicating the length of
the signal. The identifier field (name) is an identification code for each element. The depth field (depth)
holds the depth value of that element. Here the depth value is treated as additional information separate
from the positional information; however, since the arrangement of each element in three-dimensional space
can also be specified with the scene description language mentioned above, the depth value may instead be
specified within the scene description language.
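
A record with the three fields named above (index, name, depth) could be serialized as in the sketch below.
FIG. 5 is not reproduced in this text, so the field widths, the byte order, and the code value 0xD1 are
hypothetical choices for illustration only.

```python
import struct

DEPTH_CODE = 0xD1      # hypothetical identification code for depth records
_PAYLOAD = ">Hh"       # name: unsigned 16-bit element id, depth: signed 16-bit
_RECORD = ">BBHh"      # index (code, payload length) followed by the payload


def pack_depth(element_id, depth):
    """Serialize one depth record: index, then name, then depth."""
    payload_length = struct.calcsize(_PAYLOAD)  # bytes following the index
    return struct.pack(_RECORD, DEPTH_CODE, payload_length, element_id, depth)


def unpack_depth(record):
    """Parse one depth record and return (element_id, depth)."""
    code, length, element_id, depth = struct.unpack(_RECORD, record)
    if code != DEPTH_CODE or length != struct.calcsize(_PAYLOAD):
        raise ValueError("not a depth record")
    return element_id, depth
```

Storing the length in the index lets a decoder that does not understand depth records skip over them.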

[0027] The signals encoded by the encoder 203 are then multiplexed by the multiplexer 204, converted into a
transmission format, and transmitted from the transmitter.
[0028] The transmitted signal is received by the demultiplexer 206 in the receiver 205 and is separated into
the individual multiplexed elements. Each separated element is decoded by the decoder 207, which at the same
time decodes the additional information indicating the positional relationship and the depth value of each
element.

[0029] As shown in FIG. 4, the decoder 207 has a plurality of decoding sections arranged inside and performs
decoding in parallel; however, as long as the processing can be completed in real time, a single decoding
section may process the elements sequentially.

[0030] The signals decoded by the decoder 207 are input to the parallax image generator 208, which performs
the post-processing required for generating a stereoscopic video image and generates two parallax images;
these are then presented separately to the observer's two eyes.

[0031] FIG. 6 is a conceptual diagram for the case in which a depth value for each element is used as
additional information. The original two-dimensional image 221 shown in FIG. 6(a) is divided, as shown in
FIG. 6(b) to (d), into the background image 222, the house image 223, and the bus image 224. As the depth
information of these elements, depth values are assigned in order from the back: the background image 222,
the house image 223, and the bus image 224. The transmitter encodes the divided elements of the original
image, the positional relationship information of the elements, and the depth value of each element, and
transmits them to the receiver. The receiver decodes the transmitted two-dimensional video signal, the
positional relationship information between the elements, and the depth value of each element, and based on
this information generates the right image 225 and the left image 226 as parallax images, as shown in
FIG. 6(e) and (f).
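
The element-based generation described in this paragraph can be sketched for one scanline as below. The
back-to-front painting order, the sign convention for the shifts, and the names Element and compose_row are
assumptions for illustration; the text does not prescribe a compositing procedure.

```python
from typing import NamedTuple


class Element(NamedTuple):
    name: str
    x0: int             # column of the element's first pixel in the original
    pixels: list        # one scanline of the element
    depth: int          # larger = farther from the viewer (assumed convention)
    half_parallax: int  # per-eye shift in pixels; negative = in front of screen


def compose_row(elements, width, eye):
    """Composite one scanline for the given eye ("left" or "right").

    Elements are painted from the farthest to the nearest, so nearer
    elements overwrite farther ones where they overlap.
    """
    sign = -1 if eye == "left" else 1
    row = [None] * width
    for el in sorted(elements, key=lambda e: e.depth, reverse=True):
        start = el.x0 + sign * el.half_parallax
        for i, value in enumerate(el.pixels):
            x = start + i
            if 0 <= x < width:
                row[x] = value
    return row
```

With a background at the screen plane and a bus in front of it, the bus lands further right in the left-eye
row than in the right-eye row, which is the crossed disparity that makes it appear nearer.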

[0032] Next, FIG. 7 shows a method of interpolating regions in which no pixel exists, which arise when a
parallax image is generated. When a subject image is rotated or moved to create a stereoscopic image in
order to produce a stereoscopic effect, the area of the original two-dimensional picture and the area needed
for the stereoscopic image do not necessarily coincide. For this reason, a non-picture region having no
pixels may arise against the background image.

[0033] For example, as shown in FIG. 7(a) and (b), the receiver receives the elements into which the
transmitter has decomposed the original image 231, namely the background image 232 and the house image 233,
and generates a parallax image according to the image and the depth value of each element. As shown in
FIG. 7(c), a non-picture region without pixels may then occur. In this case, as shown in FIG. 7(d), the
region is interpolated by expanding the house image 233 in the spatial axis direction. It is also possible
to interpolate the region without pixels from neighboring pixels.
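
Expanding an element along the horizontal axis so that it covers a gap can be sketched as a simple
resampling of each of its scanlines. Nearest-neighbour resampling and the name stretch_row are assumptions;
the text states only that the element image is expanded.

```python
def stretch_row(pixels, new_width):
    """Resample one scanline of an element to new_width pixels.

    Each output pixel copies the source pixel at the proportionally
    corresponding position (nearest-neighbour, rounded down).
    """
    old_width = len(pixels)
    if old_width == 0 or new_width <= 0:
        return []
    return [pixels[i * old_width // new_width] for i in range(new_width)]
```

Widening the house element by the width of the uncovered region and compositing it over the background
would hide the gap, at the cost of a slight horizontal distortion of that element.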

[0034] FIG. 8 is a block diagram for explaining a third embodiment of this invention. Parts identical to
those in FIG. 1 are given the same reference numerals, and their explanation is omitted. In this embodiment,
as shown in FIG. 8, the parallax image generator 302 in the receiver 301 generates a stereoscopic video
image based on parallax information that the user inputs with the parallax adjustment input unit 303, which
is operated by the observer. This makes it possible to offer a stereoscopic video image matched to the
observer's preference. That is, the user can adjust the parallax information and thereby adjust the sense of
depth of the stereoscopic image.

[0035] FIG. 9(a) to (c) are diagrams for explaining an example of parallax adjustment. As shown in
FIG. 9(a), the depth value 312 relative to the screen plane 311 is specified to the receiver 301 by the
pre-processing of the arithmetic unit 102 of the transmitter 101.

[0036] As shown in FIG. 9(b), when the user adjusts the reference plane of depth, the user inputs an
adjustment value for the depth reference plane with the parallax adjustment input unit 303. The parallax
image generator 302 of the receiver 301 then changes the reference plane of the depth value, so that the
depth value 312 becomes the depth value 313, and generates a parallax image. In this way the depth value
specified by the transmitter 101 can be changed according to the adjustment value of the depth reference
plane that the observer inputs, the generated image can be changed so that the whole scene appears farther
back or nearer to the viewer, and a stereoscopic video image matched to the observer's preference can be
offered.

[0037] Moreover, when the user adjusts the amount of depth, the user inputs an adjustment amount for the
amount of depth with the parallax adjustment input unit 303. The parallax image generator 302 of the
receiver 301 then increases or decreases the depth value 312 according to the adjustment amount, changing it
into the depth value 314, and generates a parallax image. In this way the depth value specified by the
transmitter can be changed according to the adjustment amount that the observer inputs, the generated image
can be changed into one with a stronger or a weaker sense of depth, and a stereoscopic video image matched
to the observer's preference can be offered.
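
The two adjustments of paragraphs [0036] and [0037] can be sketched together as a shift and a scaling of
the received depth values. Treating the reference plane adjustment as an additive offset and the depth
amount adjustment as a multiplicative gain about the screen plane is an assumption; the names adjust_depth,
plane_offset, and gain are hypothetical.

```python
def adjust_depth(depth_values, plane_offset=0.0, gain=1.0):
    """Apply the observer's settings to the depth values from the transmitter.

    plane_offset -- moves the whole scene back (positive) or forward (negative)
    gain         -- values above 1 strengthen the sense of depth,
                    values between 0 and 1 weaken it
    """
    return [gain * d + plane_offset for d in depth_values]
```

The adjusted values would then be passed to the parallax image generation in place of the received ones, so
the transmitted signal itself is left unchanged.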
[0038] As explained in the first to third embodiments above, in the processing that generates a
stereoscopic video image from a two-dimensional image, the stereoscopic video image transmitter performs
pre-processing that extracts the additional information required for that generation, such as a depth value
for each pixel or each element of the two-dimensional image, and transmits a signal in which the additional
information obtained by this pre-processing and the two-dimensional image are encoded to the stereoscopic
video image receiver. The stereoscopic video image receiver receives the transmitted signal, decodes the
two-dimensional image and the additional information, respectively, and generates a stereoscopic video image
using parallax information based on the decoded two-dimensional image and additional information.

[0039] As a result, the transmitter can perform in non-real time the processing that extracts additional
information, such as the estimation of depth values, which requires many operations, and can thus obtain
highly precise additional information, while the receiver can generate a highly precise stereoscopic video
image in real time when it receives the image information.

[0040] Moreover, when a stereoscopic video image is generated based on additional information such as depth
values obtained on the transmitting side, the observer can change that additional information on the
receiving side, so a stereoscopic video image matched to the observer's preference can be offered.
[0041] 

[Effect of the Invention] As explained in detail above, according to this invention, the cost of a
stereoscopic video image receiver can be lowered, and the received image information can be processed in
real time to generate a stereoscopic video image.



[Translation done.] 